Canberra distance on ranked lists

Authors

  • Giuseppe Jurman
  • Samantha Riccadonna
  • Roberto Visintainer
  • Cesare Furlanello
Abstract

The Canberra distance is the sum, over all items, of the absolute difference between two ranks divided by their sum; it is thus a weighted version of the L1 distance. As a metric on permutation groups, the Canberra distance is a measure of disarray for ranked lists in which rank differences in the top positions incur higher penalties than movements in the bottom part of the lists. Here we describe the distance by assessing its main statistical properties, and we show extensions to partial ranked lists. We conclude by providing two examples of use in functional genomics.
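
The definition in the abstract can be illustrated with a minimal Python sketch. It assumes two complete rankings of the same items, given as lists of 1-based ranks where position i holds the rank of item i in each list. The function name `canberra_rank_distance` and the example rankings are hypothetical and serve only as illustration; the extension to partial lists mentioned in the abstract is not covered here.

```python
def canberra_rank_distance(ranks_a, ranks_b):
    """Sum of |a - b| / (a + b) over paired ranks (ranks assumed to start at 1)."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same number of items")
    if any(r < 1 for r in list(ranks_a) + list(ranks_b)):
        raise ValueError("ranks are assumed to be positive, starting at 1")
    return sum(abs(a - b) / (a + b) for a, b in zip(ranks_a, ranks_b))


# Hypothetical rankings of five items.
reference = [1, 2, 3, 4, 5]
swap_top = [2, 1, 3, 4, 5]      # the two top-ranked items exchanged
swap_bottom = [1, 2, 3, 5, 4]   # the two bottom-ranked items exchanged

d_top = canberra_rank_distance(reference, swap_top)        # 1/3 + 1/3
d_bottom = canberra_rank_distance(reference, swap_bottom)  # 1/9 + 1/9
print(d_top, d_bottom)
```

Under these assumptions the swap at the top costs 2/3 while the same swap at the bottom costs 2/9, which matches the abstract's point that disagreements in top positions are penalized more heavily.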

Related resources

Algebraic stability indicators for ranked lists in molecular profiling – Supplementary Material – Rev1

1 Permutation groups and distances
1.1 Distances
2 Equivalence of Borda list with best average position list
3 Metrics on partial lists
4 Properties of the harmonic numbers
5 Properties of the Canberra distance
5.1 Proof
5.2 The standard deviation indicator
6 Normalized indicators
7 Feature modules
8 Dataset shaving experiment
8. ...

Algebraic stability indicators for ranked lists in molecular profiling

MOTIVATION We propose a method for studying the stability of biomarker lists obtained from functional genomics studies. It is common to adopt resampling methods to tune and evaluate marker-based diagnostic and prognostic systems in order to prevent selection bias. Such caution promotes honest estimation of class prediction, but leads to alternative sets of solutions. In microarray studies, the ...

Comparing top-k XML lists

Systems that produce ranked lists of results are abundant. For instance, Web search engines return ranked lists of Web pages. There has been work on distance measures for list permutations, such as Kendall tau and Spearman’s Footrule, as well as extensions to handle top-k lists, which are more common in practice. In addition to ranking whole objects (e.g., Web pages), there is an increasing number ...

An LSH Index for Computing Kendall's Tau over Top-k Lists

We consider the problem of similarity search within a set of top-k lists under the Kendall’s Tau distance function. This distance describes how related two rankings are in terms of concordantly and discordantly ordered items. As top-k lists are usually very short compared to the global domain of possible items to be ranked, creating an inverted index to look up overlapping lists is possible but...

A Study of Metrics of Distance and Correlation Between Ranked Lists for Compositionality Detection

Compositionality in language refers to how much the meaning of some phrase can be decomposed into the meaning of its constituents and the way these constituents are combined. Based on the premise that substitution by synonyms is meaning-preserving, compositionality can be approximated as the semantic similarity between a phrase and a version of that phrase where words have been replaced by thei...

Publication date: 2009